Attribute grammars for unranked trees as a query language for structured documents

Author

  • Frank Neven
Abstract

Document specification languages such as XML model documents using extended context-free grammars. These differ from standard context-free grammars in that they allow arbitrary regular expressions on the right-hand side of productions. To query such documents, we introduce a new form of attribute grammars (extended AGs) that work directly over extended context-free grammars rather than over standard context-free grammars. Viewed as a query language, extended AGs are particularly relevant because they can take into account the inherent order of the children of a node in a document. We show that non-circularity remains decidable in EXPTIME and establish that the non-emptiness and equivalence problems for extended AGs are EXPTIME-complete. As an application, we show that Region Algebra expressions can be efficiently translated into extended AGs. This translation drastically improves the known upper bound on the complexity of the emptiness and equivalence tests for Region Algebra expressions, from non-elementary to EXPTIME. Finally, we characterize the expressiveness of extended AGs in terms of monadic second-order logic.

A preliminary version of this paper, entitled "Extensions of attribute grammars for structured document queries", was presented at the 7th International Workshop on Database Programming Languages, Kinloch Rannoch, Scotland, 1999.

Limburgs Universitair Centrum, Universitaire Campus, Dept. WNI, Infolab, B-3590 Diepenbeek, Belgium. Phone: +32-(0)11-26.82.17. Fax: +32-(0)11-26.82.99.
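The notion of an extended context-free grammar mentioned in the abstract can be illustrated with a small sketch. This is not the paper's formalism or its attribute grammars: the `Node` class, the `GRAMMAR` table, the element names, and the `conforms` function are all hypothetical, chosen only to show a production whose right-hand side is a regular expression over an ordered, unbounded sequence of children.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """A node of an unranked tree: a label plus an ordered list of children."""
    label: str
    children: List["Node"] = field(default_factory=list)


# Hypothetical extended context-free grammar. Each label maps to a regular
# expression over the sequence of child labels; every child label is written
# followed by a single space so that labels cannot run into each other.
GRAMMAR: Dict[str, str] = {
    "book": r"(title )(author )+(chapter )*",
    "chapter": r"(title )(para )*",
    "title": r"",   # leaf: no children allowed
    "author": r"",  # leaf
    "para": r"",    # leaf
}


def conforms(node: Node, grammar: Dict[str, str]) -> bool:
    """Check that every node's child sequence matches its production's regex."""
    pattern = grammar.get(node.label)
    if pattern is None:
        return False  # label has no production in this grammar
    # Encode the ordered children as a word over the label alphabet.
    child_word = "".join(child.label + " " for child in node.children)
    if re.fullmatch(pattern, child_word) is None:
        return False
    return all(conforms(child, grammar) for child in node.children)
```

Because the right-hand side is a regular expression, a `book` node may have any number of `author` and `chapter` children, and their left-to-right order matters: a tree with `author` before `title` is rejected by this sketch.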


Similar resources

Extensions of Attribute Grammars for Structured Document Queries

Widely-used document specification languages like, e.g., SGML and XML, model documents using extended context-free grammars. These differ from standard context-free grammars in that they allow arbitrary regular expressions on the right-hand side of productions. To query such documents, we introduce a new form of attribute grammars (extended AGs) that work directly over extended context-free gramm...


Query Evaluation on Compressed Trees

This paper studies the problem of evaluating unary (or node-selecting) queries on unranked trees compressed in a natural structure-preserving way, by the sharing of common subtrees. The motivation to study unary queries on unranked trees comes from the database field, where querying XML documents, which can be considered as unranked labelled trees, is an important task. We give algorithms and co...


Logical Definability and Query Languages over Unranked Trees

Unranked trees, that is, trees with no restriction on the number of children of nodes, have recently attracted much attention, primarily as an abstraction of XML documents. In this paper, we study logical definability over unranked trees, as well as collections of unranked trees, that can be viewed as databases of XML documents. The traditional approach to definability is to view each tree as a...


SQL-AG: Querying structured documents using attribute grammars

Structured documents, such as program source texts, technical documentation, or XML data, comprise an important class of data in many applications. Structured documents are distinguished from flat text by their tree structure. In a program source text, this structure is the abstract syntax tree of the program. In a technical document, this structure is the division in chapters, sections, paragr...


Connected Component Based Word Spotting on Persian Handwritten image documents

Word spotting is to make searchable unindexed image documents by locating word/words in a document image, given a query word. This problem is challenging, mainly due to the large number of word classes with very small inter-class and substantial intra-class distances. In this paper, a segmentation-based word spotting method is presented for multi-writer Persian handwritten doc-...



Journal:
  • J. Comput. Syst. Sci.

Volume 70

Publication date: 2005